Abstract

In order to explore atomic asymmetry and molecular chirality in 2D space, benzenoids composed of 3 to 11 hexagons in 2D
space were enumerated in our laboratory. These benzenoids are regarded as planar connected polyhexes and have no 
internal holes; that is, their internal regions are filled with hexagons. The produced dataset was composed of 357,968 
benzenoids, including more than 14 million atoms. Rather than simply labeling the huge number of atoms as being either 
symmetric or asymmetric, this investigation aims at exploring a quantitative graph theoretical descriptor of atomic 
asymmetry. Based on the particular characteristics in the 2D plane, we suggested the weighted atomic sum as the 
descriptor of atomic asymmetry. This descriptor is measured by circulating around the molecule going in opposite 
directions. The investigation demonstrates that the weighted atomic sums are superior to the previously reported 
quantitative descriptor, atomic sums. The investigation of quantitative descriptors also reveals that the most asymmetric 
atom is in a structure with a spiral ring with the convex shape going in clockwise direction and concave shape going in 
anticlockwise direction from the atom. Based on weighted atomic sums, a weighted F index is introduced to quantitatively 
represent molecular chirality in the plane, rather than merely regarding benzenoids as being either chiral or achiral. By 
validating with enumerated benzenoids, the results indicate that the weighted F indexes were in accordance with their 
chiral classification (achiral or chiral) over the whole benzenoids dataset. Furthermore, weighted F indexes were superior to 
previously available descriptors. Benzenoids possess a variety of shapes and can be extended to represent practically any
shape in 2D space. Our proposed descriptor thus has the potential to be a general method to represent 2D molecular
chirality based on the difference between clockwise and anticlockwise sums around a molecule.
Introduction

Many molecular properties are dependent on the molecular 
shape. When some proteins are active, these protein molecules 
may double over or twist into radically different shapes.
Electrophoresis is a technique that separates macromolecules 
according to their net electrical charge and shape. Methods related 
to the generation of shape signatures represent molecular shape, 
and use shape signatures in both ligand-based and receptor-based 
molecular design [1]. A novel ligand-based virtual screening 
method combines shape and electrostatic information into a single, 
unified framework [2]. 

Consequently, studies on molecular shapes and quantities have
been performed [3]-[9]. Generally, the shape of a molecule refers
to its surface area in ordinary 3D space. However, some studies on
molecules or superstructures are limited to motion on a metal
surface or in another 2D space; for example, displacements along
surfaces of metallic catalysts can be regarded approximately as
motions along a plane [10]. Thus, the investigation of molecular
shape in 2D space is also necessary, including 2D chirality
descriptions.

Since the molecular plane is automatically a mirror plane, a
chiral object in 2D space is an achiral object in 3D space [11]; that
is, the chirality descriptors in 3D space cannot be directly applied
to 2D chirality. Therefore, chirality descriptors need to be
developed specifically for 2D space. Several studies
on the degree of 2D chirality have been implemented. Buda and 
Mislow developed a simple method to measure the degree of 
chirality of a triangle, which is the departure between a pair of 
enantiomers of triangles [12]. Zabrodsky and Avnir suggested a 
continuous chirality measure (CCM) to represent the degree of
shape chirality. CCM was developed based on the minimal 
distances that the vertices of a shape must move in order to attain 
the nearest achiral symmetry point group. CCM was mainly 
developed for molecules in 3D space, but the degree of chirality 
can also be applied to the objects in 2D space, if the reflection 
plane in 3D was replaced by a reflection line in 2D [13]. 

Randic proposed a descriptor of 2D molecular chirality of
benzenoids, called the F index, which was derived from atomic
asymmetry. The asymmetry of an atom, called the atomic sum, was
derived from the difference between clockwise and anticlockwise
binary codes of the atom [14]. Randic et al. also studied the
basic theory of binary codes; for example, two
theorems were proven concerning the relationships between 2D
chiral classification and binary codes of benzenoids [15]-[16].
These codes exhibit four major shortcomings: 1) the degeneracy of 
atomic sums was serious; 2) the contribution to atomic asymmetry 
is not related to the distance between two atoms; 3) the degeneracy 
of the F index was obvious due to the degeneracy of atomic sums; 
and 4) the atomic sums and F indexes were evaluated by only a few 
examples. We decided to evaluate their ability to identify 2D chiral 
benzenoids in an exhaustive set. The results showed that a number 
of atomic sums were not in accordance with the classification of 
the atoms (asymmetric atoms or symmetric atoms), some F indexes 
were not in accordance with the chiral classification of the 
benzenoids, and some quantitative representations are
counterintuitive (see the Results and Discussion section).

It is an advantage of molecules in 2D space that atomic
asymmetry can be represented by circulating around the whole molecule
in opposite directions (clockwise and anticlockwise) in a plane. In
contrast, it is difficult to implement this kind of method in 3D
space. If the four shortcomings of Randic's method can be
significantly overcome, it should become an indispensable method
for the description of atomic asymmetry in 2D space, and a
valuable complement to the 2D chemical graph.

Our studies on the degree of chirality of benzenoids tried to 
exploit a practical method to represent 2D chirality based on the 
pioneering work of Randic. This investigation aimed at exploring 
the improved descriptors, in their ability to overcome the observed 
limitations of Randic's descriptors, and in their ability to
represent the atomic asymmetry from the symmetrical atoms to 
the most asymmetric atoms, as well as in their ability to represent 
the molecular chirality from the achiral molecules to the most 
chiral molecules. 

In detail, by introducing distance factor, we improved the 
method with a weighting scheme based on distances, and 
suggested weighted atomic sums to represent atomic asymmetry 
[17]; by summing the contributions of weighted atomic sums, we 
suggested a weighted F index to represent the degree of molecular 
chirality [18]. In this paper, the two descriptors are tested with 
more than 350 thousand benzenoids enumerated in our laboratory 
with an in-house program.

The shapes of benzenoids can also be regarded as shapes of 
graphite and graphene, and have potential to approximately 
represent similar shapes [19]. 

Methodologies 

In this paper, we limit our interest to 2D space. Unless stated
otherwise, the investigation of benzenoids was performed only
in 2D space. If a molecule cannot be brought into coincidence with
its mirror image by moving it within the plane, as shown in Figure 1,
the molecule is chiral in 2D space. Thus, benzenoids can be classified
as chiral molecules and achiral molecules in 2D space. We
investigate both the qualitative classification and the quantitative
measurement of molecular chirality.

The descriptors of molecular chirality of a benzenoid embedded
in a plane were calculated in the following three steps: 1)
generation of binary codes of periphery; 2) generation of 
descriptors of atomic asymmetry; 3) generation of descriptors of 
molecular chirality. 



1. Binary codes of periphery 

Each benzenoid can be represented by binary codes [20]. As an
example, atom numbers and binary codes of a benzenoid M and
its enantiomer (OM) in 2D space are shown in Figure 1a and
Figure 1b, respectively. The rules to generate the binary codes are
briefly introduced as follows:

1) Only the atoms in the periphery are coded. The external
vertices of the benzenoid are coded by 0 or 1, and the internal
vertices are neglected.

2) A 1 is assigned to each single-ring vertex and a 0 is assigned to
each ring-fusion vertex.

A molecular code can be obtained by gathering the binary codes 
together as a line notation. Starting from any atom, a clockwise 
molecular code is obtained by clockwise reading of the binary 
codes, and the anticlockwise code is obtained by anticlockwise 
reading of the binary codes. The clockwise molecular code starting 
with atom 1 of benzenoid M is: 

0001101110110101101111 

There are 22 atoms along the contour of M, so the code length 
is 22. The anticlockwise molecular code of M, also starting with 
atom 1, is: 

0111101101011011101100 

Obviously, the clockwise and anticlockwise molecular codes are 
different, although starting from the same atom (atom 1). 

Also, the anticlockwise molecular code of OM starting with
atom 1 is:

0001101110110101101111

Evidently, this code is the same as the clockwise
molecular code of M. Thus, characteristic #1 of binary
codes is obtained: the clockwise molecular code
starting with an atom in a benzenoid is equivalent to the
anticlockwise code starting with the same atom in the
mirror image of the benzenoid.

If starting with atom 2, the clockwise molecular code of M is: 

0011011101101011011110 

Obviously, this result is different from the clockwise molecular
code starting with atom 1. This means that the molecular code
depends on the starting atom.
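
The reading rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' program: the helper names `clockwise_code` and `anticlockwise_code` are hypothetical, and the only input taken from the text is the clockwise code of M starting at atom 1.

```python
def clockwise_code(code: str, atom: int) -> str:
    """Clockwise molecular code starting at the given 1-based atom number."""
    i = atom - 1
    return code[i:] + code[:i]


def anticlockwise_code(code: str, atom: int) -> str:
    """Anticlockwise molecular code starting at the same atom."""
    cw = clockwise_code(code, atom)
    # Keep the starting atom first, then read the remaining atoms backwards.
    return cw[0] + cw[:0:-1]


# Clockwise molecular code of benzenoid M starting at atom 1 (from the text).
M = "0001101110110101101111"

print(anticlockwise_code(M, 1))  # 0111101101011011101100
print(clockwise_code(M, 2))      # 0011011101101011011110
```

Under this reading, changing the starting atom is a rotation of the string, and changing the direction is a reversal that keeps the starting atom in first place.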

2. The representation of atomic asymmetry in two-dimensional
space

If the clockwise binary code of an atom is the same as the
anticlockwise binary code of the atom, the atom lies in a
symmetric environment and is a symmetric atom. If the clockwise
binary code and anticlockwise binary code of an atom are
different, the atom is in an asymmetric environment and is an
asymmetric atom. Although a benzenoid can be represented by
binary codes, and the binary codes starting from an atom can be
used to qualitatively judge the asymmetry of the atom, the binary
codes themselves cannot be directly used to quantitatively describe
atomic asymmetry. Therefore it is necessary to develop quantitative
indexes of atomic asymmetry, such as the atomic sums suggested
by Randic and the weighted atomic sums in this paper. An idea
suggested by Randic in the literature [14] was also adopted:
if an atom is asymmetric, the index of its asymmetry
should not be zero, and it represents the degree of deviation from
symmetry; if an atom is symmetric, the index
of its asymmetry should be zero.

Figure 1. Atomic asymmetry of a pair of enantiomers M and OM. a) atom numbers; b) binary codes; c) atomic sums; d) weighted atomic
sums.

2.1 Atomic sums. The atomic sums have been suggested by
Randic as a representation of atomic asymmetry in 2D space [14],
and are briefly introduced here. The clockwise ($c_1$) and anticlockwise
($a_1$) molecular codes starting from atom 1 of M are listed in
Row 2 (counting from the table header) and Row 4 of Table 1,
respectively, and are used to represent atom 1. Similarly, the
clockwise and anticlockwise molecular codes starting from atom 2
can be used to represent atom 2, and likewise for atom 3 and so on.

The clockwise partial sums of an atom are obtained from the
clockwise binary code of the atom by adding, at each site, all
entries up to and including that site. The clockwise partial sums of
atom k ($C_k$) can be expressed by the equation:

$$C_{ki} = \sum_{j=1}^{i} c_{kj}$$

where $C_{ki}$ is the ith element of $C_k$ and $c_{kj}$ is the jth element of the
clockwise binary code of atom k. For example, the partial sums of atom
1 ($C_1$) are listed in Row 3. Similarly, the anticlockwise partial sums
of atom k ($A_k$) are:

$$A_{ki} = \sum_{j=1}^{i} a_{kj}$$

where $A_{ki}$ is the ith element of $A_k$ and $a_{kj}$ is the jth element of the
anticlockwise binary code of atom k. For example, the partial sums of the
anticlockwise binary code of atom 1 ($A_1$) are listed in Row 5.

In order to observe the difference between the clockwise and the
anticlockwise sequences, the anticlockwise partial sums are
subtracted from the clockwise partial sums ($C_1 - A_1$), and the
sequence of the differences is listed in Row 7. The sum of the entries
of Row 7, -34, is the so-called atomic sum. Row 7 ($C_1 - A_1$) in
Table 1, 0, -1, -2, -2, -2, -2, -2, -2, -1, -2, -1, -1,
-2, -1, -2, -2, -2, -2, -2, -2, -1, 0, is a vector. The two halves of
this vector (shown in bold and in light type in Table 1) mirror each
other; thus, the atomic sum can be expressed by using half of the
vector. As a result, the atomic sum is also half of the original
value. In this paper, the atomic sum of atom k is defined as:

$$s_k = \sum_{i=1}^{n/2-1} (C_{ki} - A_{ki})$$

where $s_k$ denotes the atomic sum of atom k; $C_{ki}$ is element i of $C_k$;
$A_{ki}$ is element i of $A_k$; n is the number of atoms of the contour.
Counting the entries by their distance d to atom k along the contour
(the leading entry, which belongs to atom k itself, has d = 0 and is
always zero), i can be replaced by d in the equation above:

$$s_k = \sum_{d=1}^{n/2-1} (C_{kd} - A_{kd}) \qquad (1)$$

If the elements of Row 7 in Table 1 are put into equation (1), the
result is:

$$s_1 = \sum_{d=1}^{10} (C_{1d} - A_{1d}) = -1 + (-2) + (-2) + (-2) + (-2) + (-2) + (-2) + (-1) + (-2) + (-1) = -17$$

The atomic sum is a measure of the asymmetry of the atomic
environment of an atom as obtained by circulating around the
molecule in opposite directions (derived from the difference
between the clockwise binary codes and anticlockwise binary
codes), i.e., -17 represents the atomic asymmetry of atom 1 of M. All
the atomic sums of M and its enantiomer OM were obtained and
are shown in Figure 1c.
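
As a check on the worked example, the following Python sketch recomputes Row 7 and the atomic sum of atom 1 of M from its clockwise code alone. The function names are hypothetical, and the sketch assumes, as the worked example does, that the leading entry of Row 7 (distance 0) is skipped and the entries at distances 1 to n/2 - 1 are summed.

```python
from itertools import accumulate


def difference_row(code: str, atom: int) -> list[int]:
    """Clockwise minus anticlockwise partial sums for one atom (Row 7)."""
    i = atom - 1
    cw = [int(b) for b in code[i:] + code[:i]]
    acw = [cw[0]] + cw[:0:-1]  # same start atom, opposite direction
    return [c - a for c, a in zip(accumulate(cw), accumulate(acw))]


def atomic_sum(code: str, atom: int) -> int:
    """Equation (1): sum the differences at distances d = 1 .. n/2 - 1."""
    diff = difference_row(code, atom)
    return sum(diff[1:len(code) // 2])


# Clockwise molecular code of benzenoid M starting at atom 1 (from the text).
M = "0001101110110101101111"

print(sum(difference_row(M, 1)))  # -34
print(atomic_sum(M, 1))           # -17
print(atomic_sum(M, 4))           # 14
```

Because the difference row reads the same in both directions, summing half of it gives exactly half of the full sum, which is why -34 becomes -17.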

For OM, the anticlockwise molecular code starting from atom 1
is 0001101110110101101111, i.e., this code is just the clockwise
molecular code of M (see above); that is, $A_1$ of OM is equal
to $C_1$ of M and, likewise, $C_1$ of OM is equal to $A_1$ of M. Thus,
the atomic sum of atom 1 of OM is equal in
magnitude and opposite in sign to the atomic sum of atom 1 of M, i.e.,
the atomic sum of atom 1 of OM is 17. This feature of the atomic sum of
atom 1 can be extended to any atom in benzenoids; for example, the atomic
sums of atom 4 of M and OM are 14 and -14. In other words,
the atomic sums of the corresponding atoms of a pair of
enantiomers are equal in magnitude and
opposite in sign. This is characteristic #1 of atomic
sums.

Some planar benzenoids confined to 2D space are chiral, such
as the enantiomers in Figure 1; the others are achiral, such as
anthracene regarded as a benzenoid in 2D space. The
reason is that an achiral benzenoid can be superimposed on its
mirror image in 2D space.
There are self-superimposing reflection lines (symmetry axes) in
achiral benzenoids in 2D space as introduced in the literature [16];
for example, in the achiral benzenoid of Figure 2 there are two
reflection lines represented by double-headed arrows. It is obvious
that the clockwise binary code of an atom is the same as
the anticlockwise binary code of its mirror atom about
a reflection line in an achiral benzenoid, and vice versa.
This is characteristic #2 of binary codes. For
example, atom 4 and atom 5 in Figure 2 possess this characteristic.

Figure 2. Atomic asymmetry of an achiral benzenoid. The reflection lines of the molecule are represented by double-headed arrows. a) atom
numbers; b) binary codes; c) atomic sums; d) weighted atomic sums.
doi:10.1371/journal.pone.0102043.g002

Atom numbers, binary codes and atomic sums of an achiral
benzenoid with a symmetry axis are displayed in Figure 2a, 2b and
2c, respectively. The clockwise binary code of atom 4 is
"11011101101110" and the anticlockwise binary code is
"10111011011101", i.e., the clockwise and anticlockwise binary
codes beginning with the same atom, 4, are different. Thus, atom
4 is an asymmetric atom. This indicates that there are asymmetric
atoms in achiral benzenoids. According to equation (1), the atomic
sum of atom 4 is 2. It can be seen from Figure 2c that atom 4 and
atom 5 are symmetric about the vertical double-headed arrow and
their atomic sums are equal in magnitude and opposite in sign, i.e.,
2 and -2. Similarly, this
can be extended to all the atoms in achiral benzenoids in 2D
space. According to characteristic #2 of binary codes (the clockwise
binary code of an atom is the same as the anticlockwise binary
code of its mirror atom about a reflection line in an achiral
benzenoid, and vice versa) and equation (1), we obtain
characteristic #2 of atomic sums: the atomic sum of an
atom is equal in magnitude and opposite in sign to that of its
mirror atom about a self-superimposing reflection line.

The clockwise and anticlockwise codes of any symmetric atom
are the same; thus, the atomic sum of any symmetric atom
is zero based on equation (1). This is characteristic #3 of
atomic sums. For example, for atom 1 in Figure 2, which lies on the
horizontal reflection line, both the clockwise and anticlockwise binary
codes are "11011011101101"; thus atom 1 is a symmetric atom
and its atomic sum is zero.

2.2 Weighted atomic sums. On the basis of atomic sums, 
we have suggested a weighted atomic sum for the representation of 
atomic asymmetry in 2D space by introducing a distance factor 
into equation (1). The basic idea of weighted atomic sums is that 
the farther two atoms are, the less they contribute to the 
asymmetry of each other. The weighted atomic sum of atom k is 
defined as

$$s_k^{w} = \sum_{d=1}^{n/2-1} \frac{C_{kd} - A_{kd}}{d} \qquad (2)$$

where $s_k^{w}$ is the weighted atomic sum of atom k; d denotes the
distance (in number of bonds along the molecular contour) to atom
k; $C_{kd} - A_{kd}$ is the difference between the clockwise partial sum and
the anticlockwise partial sum when the distance to atom k is d; n is the
number of atoms in the molecular contour. It is emphasized that
internal atoms and bonds were not involved in the calculation of
atomic sums and weighted atomic sums, i.e., the atomic
asymmetry was only derived from the contour of the molecular
periphery represented by binary codes.

As an example of atom 1 of M, the elements of Row 6 and Row 7
in Table 1 are put into equation (2), and the result is:

$$s_1^{w} = \sum_{d=1}^{10} \frac{C_{1d} - A_{1d}}{d} = -\frac{1}{1} - \frac{2}{2} - \frac{2}{3} - \frac{2}{4} - \frac{2}{5} - \frac{2}{6} - \frac{2}{7} - \frac{1}{8} - \frac{2}{9} - \frac{1}{10} = -4.63$$

The only difference between weighted atomic sums and atomic
sums is the introduction of the distance factor. Thus the distance
factor d does not change the three features of atomic sums
mentioned above; that is, the weighted atomic sums as a representation
of atomic asymmetry in 2D space have the same three
characteristics as atomic sums:

(1) For a chiral benzenoid, e.g., the weighted atomic sum of atom 1
in OM is 4.63 and is equal in magnitude and opposite in sign
to the weighted atomic sum (-4.63) of atom 1 in M.

(2) For an achiral benzenoid, e.g., the weighted atomic sums of
atom 4 and atom 5 are 1.20 and -1.20 in Figure 2d.

(3) The weighted atomic sum of a symmetric atom is zero
because the clockwise binary code of a symmetric atom is the
same as its anticlockwise binary code, such as atom 1 in
Figure 2d.

3. Molecular chirality of benzenoids in 2-dimensional
space

3.1 F index. The F index has been suggested by Randic to
represent molecular chirality based on atomic sums. The index
was obtained by extracting the contributions of atomic asymmetries
of the contour of a benzenoid. The F index is defined as a
sum of qth powers of atomic sums:

$$F_q = \sum_{k=1}^{n} \left(\frac{4 s_k}{n}\right)^{q}, \quad q = 3, 5, 7, \ldots \qquad (3)$$

where $F_q$ denotes the F index; $s_k$ represents the atomic sum of atom
k; n is the number of atoms of the contour, which is used to
normalize the index; q is an odd power, such as 3, 5, 7 (unless
stated otherwise, the default value of q is 3 in this article).

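As an example, the F index of M is obtained by putting the
atomic sums of Figure 1c into equation (3):

$$
\begin{aligned}
F_3 &= \sum_{k=1}^{22}\left(\frac{4 s_k}{22}\right)^{3} = \left(\frac{4}{22}\right)^{3}\sum_{k=1}^{22} s_k^{3} \\
&= \left(\frac{4}{22}\right)^{3} \times [(-17)^3 + (-3)^3 + (11)^3 + (14)^3 + (6)^3 + (9)^3 + (12)^3 \\
&\quad + (4)^3 + (-4)^3 + (-1)^3 + (2)^3 + (-6)^3 + (-3)^3 + (0)^3 + (3)^3 \\
&\quad + (6)^3 + (-2)^3 + (1)^3 + (4)^3 + (-4)^3 + (-12)^3 + (-20)^3] \\
&= -47.60
\end{aligned}
$$

For OM, its F index is:

$$
\begin{aligned}
F_3 &= \sum_{k=1}^{22}\left(\frac{4 s_k}{22}\right)^{3} = \left(\frac{4}{22}\right)^{3}\sum_{k=1}^{22} s_k^{3} \\
&= \left(\frac{4}{22}\right)^{3} \times [(17)^3 + (3)^3 + (-11)^3 + (-14)^3 + (-6)^3 + (-9)^3 \\
&\quad + (-12)^3 + (-4)^3 + (4)^3 + (1)^3 + (-2)^3 + (6)^3 + (3)^3 + (0)^3 \\
&\quad + (-3)^3 + (-6)^3 + (2)^3 + (-1)^3 + (-4)^3 + (4)^3 + (12)^3 + (20)^3] \\
&= 47.60
\end{aligned}
$$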



F indexes of M and OM are equal in magnitude and opposite in
sign. Similar to characteristic #1 of atomic sums, this
characteristic of F indexes can be extended to all pairs of
enantiomers: F indexes of a pair of enantiomers are equal
in magnitude and opposite in sign. This is
characteristic #1 of the F index.

As an example, for the achiral benzenoid in Figure 2, the atomic
sums of Figure 2c are put into equation (3), and the F index is:

$$
\begin{aligned}
F_3 &= \sum_{k=1}^{14}\left(\frac{4 s_k}{14}\right)^{3} = \left(\frac{4}{14}\right)^{3}\sum_{k=1}^{14} s_k^{3} \\
&= \left(\frac{4}{14}\right)^{3} \times [(0)^3 + (-4)^3 + (-1)^3 + (2)^3 + (-2)^3 + (1)^3 + (4)^3 \\
&\quad + (0)^3 + (-4)^3 + (-1)^3 + (2)^3 + (-2)^3 + (1)^3 + (4)^3] \\
&= 0
\end{aligned}
$$

The F index is zero, because the atomic sums of two atoms that are
symmetric about any reflection line (such as the double-headed arrows
shown in Figure 2) are equal in magnitude and opposite in sign,
and each item in equation (3) is canceled by its counter item. This can
also be extended to all achiral benzenoids, that is, F indexes of
all achiral benzenoids are zero. This is
characteristic #2 of the F index.

It has been discussed by Randic that if q is an even number, F
indexes do not satisfy the two characteristics above. Thus, q is
limited to odd numbers [14]. In addition, if q is equal to 1, the F index
is zero for each benzenoid. Thus, q is not equal to 1.
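
The two characteristics of the F index can be checked numerically with a short Python sketch. It is not the authors' program: the function names are hypothetical, the normalization 4·s/n is taken from the worked examples above, and the two input strings are the clockwise codes quoted in the text for M (atom 1) and for the achiral benzenoid of Figure 2 (atom 1). The code of OM is obtained by reading M anticlockwise.

```python
from itertools import accumulate


def atomic_sums(code: str) -> list[int]:
    """Equation (1) for every contour atom, in atom order."""
    n = len(code)
    sums = []
    for i in range(n):
        cw = [int(b) for b in code[i:] + code[:i]]
        acw = [cw[0]] + cw[:0:-1]
        diff = [c - a for c, a in zip(accumulate(cw), accumulate(acw))]
        sums.append(sum(diff[1:n // 2]))
    return sums


def f_index(code: str, q: int = 3) -> float:
    """Equation (3): sum of q-th powers of the normalized atomic sums."""
    n = len(code)
    return sum((4 * s / n) ** q for s in atomic_sums(code))


M = "0001101110110101101111"    # chiral example, clockwise from atom 1
OM = M[0] + M[:0:-1]            # its mirror image: M read anticlockwise
FIG2 = "11011011101101"         # achiral example of Figure 2

print(round(f_index(M), 2))     # -47.6
print(round(f_index(OM), 2))    # 47.6
print(abs(f_index(FIG2)) < 1e-12)  # True
print(f_index(M, q=1) == 0)     # True
```

The last line illustrates why q = 1 is excluded: the atomic sums of a benzenoid add up to zero, so the first power carries no information.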

3.2 Weighted F index. As with atomic asymmetry, the
index of a chiral object should not be zero, and it represents the
degree of deviation from achirality; at the same time, the index of
an achiral object should be zero. Due to its low discrimination
power, the F indexes of some chiral benzenoids are zero, which
means these benzenoids are wrongly classified as achiral. In order
to improve the discrimination power of the F index, we defined a
weighted F index in which the atomic sums are replaced by
weighted atomic sums. The definition of the weighted F index is:

$$F_q^{w} = \sum_{k=1}^{n} \left(\frac{4 s_k^{w}}{n}\right)^{q}, \quad q = 3, 5, 7, \ldots \qquad (4)$$



where $s_k^{w}$ represents the weighted atomic sum of atom k; n is the
number of atoms of the contour; q is an odd power and can be 3,
5, 7, and so on. Although the weighted F index can be extended to be
a vector, we took only a single value (q = 3) in this article because a
single value of the weighted F index already possesses enough
discrimination ability. Just like the F index, the weighted F index is
the sum of the contributions of weighted atomic sums.

Using M as an example, the weighted atomic sums of Figure 1d
are put into equation (4), and the weighted F index ($F_3^{w}$) is:



$$
\begin{aligned}
F_3^{w} &= \sum_{k=1}^{22}\left(\frac{4 s_k^{w}}{22}\right)^{3} = \left(\frac{4}{22}\right)^{3}\sum_{k=1}^{22} (s_k^{w})^{3} \\
&= \left(\frac{4}{22}\right)^{3} \times [(-4.63)^3 + (-0.58)^3 + (3.43)^3 + (3.84)^3 + (0.45)^3 \\
&\quad + (1.68)^3 + (3.11)^3 + (0.55)^3 + (-1.99)^3 + (-0.49)^3 \\
&\quad + (0.92)^3 + (-2.12)^3 + (-0.87)^3 + (0.03)^3 + (0.91)^3 \\
&\quad + (2.15)^3 + (-0.89)^3 + (0.54)^3 + (2.08)^3 + (-0.35)^3 \\
&\quad + (-2.52)^3 + (-5.25)^3] \\
&= -0.76
\end{aligned}
$$

As an example of OM, the weighted atomic sums of Figure 1d
are also put into equation (4), and the weighted F index is:

$$
\begin{aligned}
F_3^{w} &= \sum_{k=1}^{22}\left(\frac{4 s_k^{w}}{22}\right)^{3} = \left(\frac{4}{22}\right)^{3}\sum_{k=1}^{22} (s_k^{w})^{3} \\
&= \left(\frac{4}{22}\right)^{3} \times [(4.63)^3 + (0.58)^3 + (-3.43)^3 + (-3.84)^3 \\
&\quad + (-0.45)^3 + (-1.68)^3 + (-3.11)^3 + (-0.55)^3 + (1.99)^3 \\
&\quad + (0.49)^3 + (-0.92)^3 + (2.12)^3 + (0.87)^3 + (-0.03)^3 \\
&\quad + (-0.91)^3 + (-2.15)^3 + (0.89)^3 + (-0.54)^3 + (-2.08)^3 \\
&\quad + (0.35)^3 + (2.52)^3 + (5.25)^3] \\
&= 0.76
\end{aligned}
$$
Figure 3. Examples of benzenoids composed of 1-6 benzene rings on a hexagonal lattice.

doi:10.1371/journal.pone.0102043.g003



It can be seen that, just like the F index, the weighted F indexes of
M and OM are equal in magnitude and opposite in sign.
Weighted F indexes of a pair of enantiomers are equal in
magnitude and opposite in sign. This is characteristic
#1 of the weighted F index.

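As an example, for the achiral benzenoid in Figure 2, the weighted
atomic sums of Figure 2d are put into equation (4):

$$
\begin{aligned}
F_3^{w} &= \sum_{k=1}^{14}\left(\frac{4 s_k^{w}}{14}\right)^{3} = \left(\frac{4}{14}\right)^{3}\sum_{k=1}^{14} (s_k^{w})^{3} \\
&= \left(\frac{4}{14}\right)^{3} \times [(0.00)^3 + (-1.95)^3 + (-0.33)^3 + (1.20)^3 \\
&\quad + (-1.20)^3 + (0.33)^3 + (1.95)^3 + (0.00)^3 + (-1.95)^3 \\
&\quad + (-0.33)^3 + (1.20)^3 + (-1.20)^3 + (0.33)^3 + (1.95)^3] \\
&= 0
\end{aligned}
$$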

Just like in the F index, each item derived from an atom in
equation (4) is canceled by its counter item derived from its mirror
atom about a self-superimposing reflection line; thus, the weighted
F index of any achiral benzenoid is zero [14]. This is
characteristic #2 of the weighted F index.

Just like the F index, if q is an even number, weighted F indexes do
not satisfy the two characteristics above. Thus, q is limited to odd
numbers in this paper.
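
The weighted descriptors can be recomputed in the same way. The Python sketch below is illustrative only (hypothetical function names, not the authors' program); it assumes the 1/d weighting of equation (2) and the 4·s/n normalization used in the worked examples, and takes as input only the two clockwise codes quoted in the text.

```python
from itertools import accumulate


def weighted_atomic_sums(code: str) -> list[float]:
    """Equation (2) for every contour atom, in atom order."""
    n = len(code)
    sums = []
    for i in range(n):
        cw = [int(b) for b in code[i:] + code[:i]]
        acw = [cw[0]] + cw[:0:-1]
        diff = [c - a for c, a in zip(accumulate(cw), accumulate(acw))]
        # Divide each difference by its distance d to the starting atom.
        sums.append(sum(diff[d] / d for d in range(1, n // 2)))
    return sums


def weighted_f_index(code: str, q: int = 3) -> float:
    """Equation (4): sum of q-th powers of the normalized weighted sums."""
    n = len(code)
    return sum((4 * s / n) ** q for s in weighted_atomic_sums(code))


M = "0001101110110101101111"   # chiral example, clockwise from atom 1
FIG2 = "11011011101101"        # achiral example of Figure 2

print(round(weighted_atomic_sums(M)[0], 2))   # -4.63
print(round(weighted_f_index(M), 2))          # -0.76
print(abs(weighted_f_index(FIG2)) < 1e-12)    # True
```

Atom 14 of M is a useful spot check: its atomic sum is 0, whereas its weighted atomic sum is 1/8 - 1/10 = 0.025, so the distance weighting separates it from the symmetric atoms.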

4. Enumeration of benzenoids in 2D space 

Some examples of benzenoids are displayed in Figure 3. In this
figure, all the benzenoids composed of 1 to 4 hexagonal rings are
displayed, including one benzenoid composed of 1 hexagon, one
benzenoid composed of 2 hexagons, three benzenoids composed
of 3 hexagons and ten benzenoids composed of 4 hexagons. At the
same time, one benzenoid composed of 5 hexagons and one
benzenoid composed of 6 hexagons are also shown in Figure 3.

For testing our method, a dataset of benzenoids in 2-dimensional
space was obtained by enumeration. The benzenoids
in this paper were regarded as planar simply connected polyhexes
and all internal regions of benzenoids were filled with hexagons,
that is to say, the benzenoids have no internal holes [21]-[22].

The enumeration procedure developed in our laboratory is
based on the number of given hexagons. For example, if we input
5 as the number of hexagons, the program enumerates all
benzenoids composed of 5 hexagons. In order to enumerate all
the benzenoids composed of h (h>2) hexagons, the following
procedure was implemented:

Firstly, an isosceles trapezoid was generated. As shown in
Figure 4, an isosceles trapezoid region on a hexagonal lattice was
used to enumerate benzenoids. Within the isosceles trapezoid
region, each hexagon is denoted with a number, and the bases and
legs of the trapezoid can be easily represented by hexagon numbers; for
example, the legs of the trapezoid are represented by {1, 6, 10} and
{5, 9, 12}. The length of a leg is the number of hexagons
contained in the leg, i.e., the leg length of the trapezoid in Figure 4 is 3.

For the enumeration of benzenoids composed of h hexagons,
our investigations indicated that the suitable length of each leg (m)
of the trapezoid is:

$$m = \left\lfloor \frac{2h+1}{3} \right\rfloor \qquad (5)$$

where h is the number of hexagons of each enumerated
benzenoid; m is the integer part of the quotient (2h+1)/3. The
lengths of the two bases are h and h-m+1. It is proven in File S1 that
this type of isosceles trapezoid in fact contains all of the benzenoids
of size h. An example of such an isosceles trapezoid is shown in
Figure 4.
The numbers of the achiral benzenoids and of all the benzenoids in 2D, as well as of all
the benzenoids in 3D space, are listed in Table 2. 2D achiral
benzenoids can be identified by binary codes. It has been proved
that a benzenoid is achiral if and only if its binary code is
cp-palindromic: if there are vertices i and j in a benzenoid such that
the clockwise binary code starting with i is the same as the
anticlockwise binary code starting with j, the binary code of
the benzenoid is cp-palindromic [16].
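
The cp-palindromic condition lends itself to a compact test. The following Python sketch is one possible way to express it, not the authors' implementation, and the function name `is_cp_palindromic` is hypothetical. It relies on the observation that every anticlockwise code of a contour is a rotation of the reversed clockwise string.

```python
def is_cp_palindromic(code: str) -> bool:
    """True if some clockwise reading equals some anticlockwise reading."""
    # All anticlockwise codes are rotations of code[::-1], and every rotation
    # of a string occurs as a substring of the string written twice.
    return code[::-1] in code + code


M = "0001101110110101101111"   # chiral example from the text
FIG2 = "11011011101101"        # achiral example of Figure 2

print(is_cp_palindromic(M))     # False
print(is_cp_palindromic(FIG2))  # True
```

This gives a purely qualitative achiral or chiral label, in contrast to the F index and weighted F index, which attach a magnitude to the chirality.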

From Table 2, it can be seen that in 2D space three
benzenoids composed of 3 hexagons (all of them achiral),
ten benzenoids composed of 4 hexagons (four of them
achiral), etc. were obtained. In summary, 357,968 benzenoids in
2D were enumerated, including 1,430 achiral benzenoids. The
enumerated results in 3D agree with those published in the
literature [23].

The chiral benzenoids in 2D space are not chiral molecules in 
3D space, i.e., a pair of enantiomers in 2D space become an 
achiral benzenoid in 3D. For example, M and OM are two chiral 
benzenoids in 2D space, but they become an achiral benzenoid in 
3D space, because OM becomes the same as M when OM is 
rotated 180° about an axis in the molecular plane. An achiral 
benzenoid in 2D is still an achiral benzenoid in 3D. Thus, if the 
number of benzenoids in 2D space is denoted by $N_{2D}$, the number
of benzenoids in 3D is denoted by $N_{3D}$ and the number of achiral
benzenoids in 2D is denoted by $N_{2Da}$, then they satisfy the
equation:

$$N_{3D} = (N_{2D} + N_{2Da})/2 \qquad (6)$$

The numbers of benzenoids in 2D space cannot be retrieved from
the literature, but the correctness of these numbers can be checked
indirectly, because $N_{3D}$ can be calculated from $N_{2D}$ and $N_{2Da}$
based on equation (6). If one or more benzenoids in 2D space were
missed, the calculated $N_{3D}$ must be lower than $N_{3D}$ in the
literature. It can be seen that all $N_{3D}$ calculated by equation (6)
are in accordance with $N_{3D}$ in the literature, which means that no
benzenoid in 2D space was missed, i.e., all the 2D benzenoids were
enumerated.
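
The consistency check based on equation (6) is simple arithmetic and can be replayed on the counts reported in Table 2. In the Python sketch below, the tuple layout and the name `TABLE_2` are illustrative choices; the numbers themselves are copied from the table.

```python
# (hexagons, N_2D, N_2Da, N_3D) as reported in Table 2.
TABLE_2 = [
    (3, 3, 3, 3),
    (4, 10, 4, 7),
    (5, 33, 11, 22),
    (6, 146, 16, 81),
    (7, 618, 44, 331),
    (8, 2803, 67, 1435),
    (9, 12824, 186, 6505),
    (10, 59883, 289, 30086),
    (11, 281648, 810, 141229),
]

for h, n2d, n2da, n3d in TABLE_2:
    # Equation (6): each 2D enantiomer pair collapses to one 3D benzenoid.
    assert (n2d + n2da) % 2 == 0
    assert (n2d + n2da) // 2 == n3d, h

total_2d = sum(row[1] for row in TABLE_2)
total_achiral = sum(row[2] for row in TABLE_2)
print(total_2d, total_achiral, total_2d - total_achiral)  # 357968 1430 356538
```

The parity assertion reflects the fact that chiral 2D benzenoids come in enantiomer pairs, so the number of chiral entries in each row is even.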

The binary codes of all the enumerated benzenoids are listed in
File S2, and each binary code is followed by the positions of the
corresponding benzene rings in the isosceles trapezoid region,
denoted by the hexagon numbers.



Table 2. The numbers of the achiral benzenoids and all the benzenoids in 2D space, and the number of benzenoids in 3D space.

Number of hexagons   Benzenoids in 2D   Achiral benzenoids in 2D   Benzenoids in 3D
3                    3                  3                          3
4                    10                 4                          7
5                    33                 11                         22
6                    146                16                         81
7                    618                44                         331
8                    2,803              67                         1,435
9                    12,824             186                        6,505
10                   59,883             289                        30,086
11                   281,648            810                        141,229
Summary              357,968            1,430                      179,699

doi:10.1371/journal.pone.0102043.t002
Figure 4. An isosceles trapezoid on a hexagonal lattice.

doi:10.1371/journal.pone.0102043.g004

The isosceles trapezoid for benzenoids composed of h = 5 hexagons is shown in Figure 4. The
lengths of both legs are m = ⌊(2 × 5 + 1)/3⌋ = 3, and the two bases
are h = 5 and h-m+1 = 5-3+1 = 3. All the benzenoids composed of
h = 5 hexagons can be obtained by enumerating benzenoids in
such a region.

Each enumerated benzenoid can be represented by the
combination of 5 hexagons in terms of hexagon numbers. Hence,
the benzenoids composed of 5 hexagons can be represented by
{1, 2, 3, 4, 5}, {1, 2, 3, 4, 6}, {1, 2, 3, 4, 7}, {1, 2, 3, 4, 8}, and so on.

If benzenoids composed of 3 hexagons (h = 3) are enumerated,
the length of any leg of the corresponding isosceles trapezoid is
m = 2 based on equation (5) and the lengths of the two bases are h = 3
and h-m+1 = 2. If the trapezoid region is represented by
{1, 2, 3, 6, 7} in Figure 4, all three benzenoids composed of 3
hexagons can be obtained, and they are benzenoids {1, 2, 3},
{1, 2, 6} and {1, 2, 7}.
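
The size of the enumeration region follows directly from equation (5). The Python sketch below computes the leg length and lays out hexagon numbers row by row; the function name `trapezoid` is hypothetical, and the numbering scheme (consecutive numbers starting from the long base) is an assumption chosen because it reproduces the legs {1, 6, 10} and {5, 9, 12} quoted for Figure 4.

```python
def trapezoid(h: int) -> tuple[int, list[list[int]]]:
    """Leg length m and row-by-row hexagon numbers for benzenoids of size h."""
    m = (2 * h + 1) // 3  # equation (5): integer part of (2h+1)/3
    rows, first = [], 1
    # Row lengths run from the long base h down to the short base h-m+1.
    for length in range(h, h - m, -1):
        rows.append(list(range(first, first + length)))
        first += length
    return m, rows


m, rows = trapezoid(5)
print(m)                          # 3
print([row[0] for row in rows])   # [1, 6, 10]
print([row[-1] for row in rows])  # [5, 9, 12]
print(sum(len(row) for row in rows))  # 12
```

For h = 3 the same function gives m = 2 and rows of 3 and 2 hexagons, which matches the five-hexagon region described above, although the text labels that region with the numbers of the larger trapezoid in Figure 4.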

The binary codes were derived from the atoms in the contour of
benzenoids. In order to implement this, the program only
considers the bonds on the periphery. If a bond belongs to only
one hexagon, the bond is on the periphery. If a bond belongs to
two hexagons, the bond is not on the periphery. No bond
belongs to more than two hexagons. This program was written
using the C programming language in our laboratory.
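
The periphery rule can be illustrated independently of the original C program. In the Python sketch below, hexagons are addressed by axial lattice coordinates (q, r); this coordinate system and the function name `periphery_bond_count` are assumptions made for the illustration and differ from the hexagon numbering of Figure 4.

```python
# The six neighbours of a hexagon in axial coordinates (q, r).
NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def periphery_bond_count(hexagons: set[tuple[int, int]]) -> int:
    """Count bonds that belong to exactly one hexagon of the benzenoid."""
    count = 0
    for q, r in hexagons:
        for dq, dr in NEIGHBOURS:
            # A bond facing an absent neighbour belongs to this hexagon only.
            if (q + dq, r + dr) not in hexagons:
                count += 1
    return count


print(periphery_bond_count({(0, 0)}))                  # 6
print(periphery_bond_count({(0, 0), (1, 0), (2, 0)}))  # 14
```

For a benzenoid without internal holes the periphery is a single closed contour, so the number of periphery bonds equals the number of contour atoms, i.e., the length of the binary code.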

Dataset 

The benzenoids composed of 3 to 11 hexagons were enumerated
by the method described in Methodologies. The numbers of
benzenoids obtained are listed in Table 2.
Figure 5. An asymmetric atom possessing a weighted atomic
sum of zero.

doi:10.1371/journal.pone.0102043.g005

The molecular chirality was investigated based on the dataset of
357,968 benzenoids, composed of 1,430 achiral benzenoids and
357,968 - 1,430 = 356,538 chiral benzenoids in 2-dimensional
space. The ratio of the number of achiral benzenoids to the number of
all the benzenoids in 2D is 1,430/357,968 = 0.4%. In this work we
put special emphasis on chiral benzenoids in 2D. On the periphery
of these benzenoids there are 14,635,116 atoms, which were
used to perform the studies on atomic asymmetry.
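The totals quoted above can be cross-checked against the per-size counts in the table. The sketch below transcribes the table rows into a dictionary (the variable names are illustrative) and recomputes the summary row, the number of chiral benzenoids and the achiral ratio.

```python
# h: (benzenoids in 2D, achiral benzenoids in 2D, benzenoids in 3D),
# transcribed from the table of enumerated benzenoids.
counts = {
    3: (3, 3, 3),
    4: (10, 4, 7),
    5: (33, 11, 22),
    6: (146, 16, 81),
    7: (618, 44, 331),
    8: (2803, 67, 1435),
    9: (12824, 186, 6505),
    10: (59883, 289, 30086),
    11: (281648, 810, 141229),
}

total_2d = sum(row[0] for row in counts.values())
total_achiral = sum(row[1] for row in counts.values())
total_3d = sum(row[2] for row in counts.values())
total_chiral = total_2d - total_achiral
achiral_percent = 100.0 * total_achiral / total_2d

print(total_2d, total_achiral, total_3d, total_chiral)
print(f"achiral fraction: {achiral_percent:.1f}%")
```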

Results and Discussion 

1. Assessment of atomic asymmetry descriptors 

As mentioned above, the asymmetry of a symmetric atom
should be represented by zero, and the asymmetry of an
asymmetric atom should be a nonzero value that represents
the degree of deviation from symmetry. The rules outlined
above are used to assess atomic sums and weighted atomic
sums in this paper.

1.1 Assessment of the classification ability of atomic
asymmetry descriptors. As mentioned before, the 14,635,116
atoms of benzenoids in the dataset included 14,633,852 asymmetric
atoms and 1,264 symmetric atoms. First, the atomic sums of
all the asymmetric atoms in the 356,538 chiral benzenoids and 1,430
achiral benzenoids of the dataset were calculated, and their ability
to discriminate asymmetric from symmetric atoms was assessed.
The results showed that the atomic sums of 275,022 asymmetric
atoms were zero, i.e., 1.9% of asymmetric atoms (~1 atom/52
atoms) were not correctly predicted. For example, the atomic sum
of asymmetric atom 14 in Figure 1c is zero, which is not in
accordance with its asymmetry. In contrast, the weighted atomic
sum of atom 14 in Figure 1c is 0.03.



Figure 6. Comparison of sections having the same atomic sums for four benzenoids.

doi:10.1371/journal.pone.0102043.g006
[Figure 7 panel labels: weighted atomic sum = 7.88, 6.88, 5.92, 4.63, 3.28. A circle marks the atom possessing the largest weighted atomic sum for each benzenoid size.]

Figure 7. The atoms possessing the largest weighted atomic sums for the benzenoids of sizes 3 to 11. The size of a
benzenoid is the number of hexagonal rings contained in the benzenoid.
doi:10.1371/journal.pone.0102043.g007
[Figure 8 legend: 2. atomic sums; 3. weighted atomic sums. Panel labels: weighted F index = -0.598850725211, F index = -28.5619834711; weighted F index = -0.605396036912, F index = -28.5619834711.]

Figure 8. An illustration of two benzenoids possessing the same F index vector.

doi:10.1371/journal.pone.0102043.g008



Then the weighted atomic sums of all 14,633,852 asymmetric
atoms in the dataset were calculated, and the values of 72
asymmetric atoms were zero, i.e., <0.0005% of asymmetric atoms
were wrongly predicted. Thus, weighted atomic sums are
satisfactory for classifying atomic asymmetry in 2D. Figure 5
illustrates an asymmetric atom possessing a weighted atomic sum of
zero; its calculation is briefly shown as follows.



\[
\sum_{d} (C_{id} - A_{id})/d
  = 0 + 1/2 + 0 - 1/4 - 1/5 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0
  + 0 + 0 + 0 + 0 - 1/20
  = 0
\]
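The cancellation in this sum is easy to verify with exact rational arithmetic. The sketch below takes the twenty numerators directly from the worked calculation for the atom in Figure 5 (term d contributes numerator/d) and sums them with Python's `fractions.Fraction`. The names `differences` and `weighted_sum` are illustrative, and reading each numerator as the clockwise value minus the anticlockwise value at step d is an assumption.

```python
from fractions import Fraction

# Numerators of the 20 terms (d = 1..20) read off the worked sum:
# 0 + 1/2 + 0 - 1/4 - 1/5 + (14 zeros) - 1/20
differences = [0, 1, 0, -1, -1] + [0] * 14 + [-1]


def weighted_sum(diffs):
    """Sum diff/d over d = 1, 2, ... using exact fractions."""
    return sum(Fraction(diff, d) for d, diff in enumerate(diffs, start=1))


total = weighted_sum(differences)
print(total)  # 0
```

Exact fractions avoid the small floating-point residue that could otherwise hide a true zero.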
1.2 Studies on quantitative atomic asymmetry. Because
the atomic sums and weighted atomic sums of symmetric atoms
are all zero, only the asymmetric atoms are considered in
this section.

It was observed that many atomic sums degenerate
to the same value with no apparent relationship to chemical
intuition. For example, the atomic sums of atom 2 and atom 13 of
compound M were both -3; the atomic sums of atom 9 and atom
20 were both -4; and the atomic sums of atom 8 and atom 19 were
both 4. The weighted atomic sums of these six atoms, however,
differ from one another (see Figure 1d):
the values for atoms 2 and 13 were -0.58 and -0.87; for
atoms 9 and 20, -1.99 and -0.35; and for atoms 8 and 19, 0.55
and 2.08, respectively. Consequently, weighted atomic sums seem
more reasonable than atomic sums for representing the atomic
asymmetry of these atoms.

It is also noted that when two similar benzenoids are
superimposed, an overlapping stretch of the two
molecules sometimes has the same atomic sums. An example of four
compounds is given in Figure 6. Figure 6a shows the
atom numbers, atomic sums and weighted atomic sums of all the
atoms; Figures 6b, 6c and 6d show the atomic sums of atoms that have the
same local environment as in Figure 6a, although their
asymmetric atomic environments are different.

As above, the case of weighted atomic sums is quite different
from that of atomic sums. The weighted atomic sums of these
atoms are also shown in Figure 6. No two values of the weighted
atomic sums are the same, so weighted atomic sums
reduce the degeneracy. At the same time, atoms lying in the same local
environment have similar weighted atomic sums.

It is interesting to observe the largest weighted atomic sums in
benzenoids of different sizes, because they relate to a particular
molecular "shape" (see below). Herein, the size of a benzenoid is
the number of hexagonal rings contained in the benzenoid.
According to the definition of the weighted atomic sum in equation
(2), the larger the difference between the clockwise partial sums and
the anticlockwise partial sums of an atom, the larger the weighted
atomic sum of the atom. The extreme case of the largest atomic
asymmetry would be that all the clockwise binary codes are "1" and all
the anticlockwise binary codes are "0". However, that case cannot
exist. The atom possessing the largest weighted atomic sum in
the dataset is shown on the top left of Figure 7.
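The dependence on partial sums can be illustrated with a small sketch. Equation (2) is not restated in this section, so the form used here is an assumption inferred from the description above and from the worked example for Figure 5: at step d, the difference between the clockwise and anticlockwise partial sums of the binary codes is divided by d. The function name and the toy code lists are hypothetical and do not correspond to any benzenoid in the dataset.

```python
from fractions import Fraction
from itertools import accumulate


def weighted_atomic_sum(clockwise_codes, anticlockwise_codes):
    """Assumed form: sum over d of (C_d - A_d) / d, where C_d and A_d are
    the partial sums of the first d clockwise / anticlockwise binary codes."""
    cw_partial = accumulate(clockwise_codes)
    acw_partial = accumulate(anticlockwise_codes)
    return sum(
        Fraction(c - a, d)
        for d, (c, a) in enumerate(zip(cw_partial, acw_partial), start=1)
    )


# Identical codes in both directions (a symmetric atom) give zero.
print(weighted_atomic_sum([1, 0, 1], [1, 0, 1]))

# Idealized extreme: all clockwise codes 1, all anticlockwise codes 0.
print(weighted_atomic_sum([1] * 5, [0] * 5))
```

Under this assumed form, swapping the two directions only changes the sign of the result, and the idealized extreme grows with the number of steps, which is why it serves as an upper reference rather than an attainable value.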

In this figure, two arrows starting from the atom possessing the
highest weighted atomic sum show the clockwise and
anticlockwise directions, respectively. The shape looks convex along the clockwise
binary codes and concave along the anticlockwise
binary codes, as shown at the top of Figure 7. The shape of the
benzenoid is a spiral ring. In this research, the end that includes the
atom possessing the largest weighted atomic sum is called the head,
and the other end is called the tail.

The benzenoid described above contains 11 benzene rings. Similarly,
the benzenoids composed of h = 10 to 3 benzene rings can be
obtained by cutting that benzenoid from the end of the tail, and
the atoms possessing the largest weighted atomic sums are all in a
spiral ring system. From the above results, atomic asymmetry can be
characterized roughly. If an atom lies in a nearly symmetric
environment, the value of its atomic asymmetry is likely to be close
to zero; if an atom lies in an environment where the clockwise shape
is nearly convex and the anticlockwise shape is nearly
concave, its atomic asymmetry is close to the maximum atomic
asymmetry.

2. Assessment of descriptors of molecular chirality 

2.1 Assessment of the classification ability of descriptors
of molecular chirality. As mentioned
previously, if the descriptor of a chiral benzenoid is zero,
the benzenoid will be wrongly recognized as an achiral
benzenoid; this is called "degeneracy". The F indexes (q = 3 as default)
of 2,958 out of 356,538 chiral benzenoids were zero, i.e., 2,958
chiral benzenoids were wrongly classified.

If q = 3, 5 and 7 are each used individually, a 3-dimensional F index vector is
obtained. The discriminatory ability of the F indexes increased,
but 530 chiral benzenoids still had zero vectors.
In contrast, no weighted F index (q = 3 as default) was zero.
Therefore, all the chiral benzenoids in the dataset were correctly
classified based on weighted F indexes.
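The degeneracy counts reported in sections 1.1 and 2.1 can be put on a common scale by converting them to percentages. The sketch below uses only the counts stated in the text; the helper `zero_rate` and the dictionary keys are illustrative names.

```python
ASYMMETRIC_ATOMS = 14_633_852   # asymmetric atoms in the dataset
CHIRAL_BENZENOIDS = 356_538     # chiral benzenoids in the dataset


def zero_rate(zero_count, total):
    """Percentage of asymmetric or chiral objects whose descriptor is zero."""
    return 100.0 * zero_count / total


rates = {
    "atomic sums": zero_rate(275_022, ASYMMETRIC_ATOMS),
    "weighted atomic sums": zero_rate(72, ASYMMETRIC_ATOMS),
    "F index (single value)": zero_rate(2_958, CHIRAL_BENZENOIDS),
    "F index (3-dimensional vector)": zero_rate(530, CHIRAL_BENZENOIDS),
    "weighted F index": zero_rate(0, CHIRAL_BENZENOIDS),
}

for name, rate in rates.items():
    print(f"{name}: {rate:.5f}% wrongly classified")
```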

2.2 Studies on quantitative molecular chirality. Some
benzenoids possess the same F indexes, such as the two benzenoids in
Figure 8. For both benzenoids, the contributions of the four circled
atomic sums to the F indexes are zero and the remaining
atomic sums are the same, which is unreasonable, as discussed for
Figure 6. Thus, the ability of F indexes to represent chirality can
be called into question. In contrast, the weighted F indexes of the two
similar benzenoids are close but distinct (-0.599 and -0.605). Similarly, three
benzenoids composed of eight benzene rings that possess the same F
indexes are shown in Figure 9.

It is interesting to observe the largest weighted F index among the
benzenoids of size 11 in Figure 10. This benzenoid has two head ends that are
the same as the head end in Figure 7. For benzenoids of size 10,
the benzenoid possessing the largest weighted F index also has a
two-head structure.

Conclusion 

In this research, the weighted atomic sums were obtained by
circulating around the molecule in opposite directions based on
the features in 2D space, and the weighted F indexes were derived
from the weighted atomic sums. The two indexes were tested on a
large dataset, and the results indicate that the two indexes
presented here are superior to the atomic sums and F indexes previously
available in the literature. Because chiral benzenoids take a wide variety of
shapes, the weighted atomic sums are suggested as
quantitative descriptors of atomic asymmetry and the weighted F indexes as
quantitative descriptors of molecular chirality,
rather than merely labeling objects as asymmetric, symmetric,
chiral or achiral.

[Figure 10 panel labels: wF = 1.851729975846; wF = 1.860078184787.]

Figure 10. The benzenoids possessing the largest weighted F indexes for benzenoids of sizes 10 and 11. All the weighted atomic sums
are listed.

doi:10.1371/journal.pone.0102043.g010

The description of benzenoids might approximately be extended
to represent any shape in 2D space, if the shape is put on a
hexagonal lattice.

Although this study focused on purely graph theoretical aspects of
objects in 2D space, this research might be used in QSAR work.
For example, the weighted atomic sums have the potential to be
applied to NMR data prediction for two reasons: 1) both weighted
atomic sums and NMR data are sensitive to local symmetry; 2)
both of them follow the rule that the farther apart two atoms are, the less
they contribute to each other.

Supporting Information 

File S1 The proof that any benzenoid composed of h
benzene rings can be enumerated in a specific isosceles
trapezoid.

(PDF)

File S2 The isosceles trapezoid regions used to enumerate
benzenoids composed of 3 to 11 benzene rings
and all the benzenoids that have been enumerated.

(ZIP)